Effects of environmental factors on spatio-temporal movement patterns of students

Project for Patterns and Trends in Environmental Data - Computational Movement Analysis

Author

Mirjam Scheib & Miriam Steinhauer

TO DOS

Map visualization schön mache Boxplots vereinheitlichen

Introduction

It is known that several factors like weather condition (Brum-Bastos et al., 2018; Guo et al., 2022), day of the week and time of the day (Liu et al., 2020; Sathishkumar et al., 2020) influence spatio-temporal movement patterns of people. There are several study’s investigating movement patterns of people in urban areas with the aim to improve the distribution of facilities and the provision of transportation services, as well as to manage traffic peaks (Kyaing et al., 2017; Liu et al., 2020). Nevertheless, most studies do not focus on specific groups (working people, students, elderlies), that might follow different spatio-temporal patterns, showing a potential demand for a specific adaption of the infrastructure. The canton of Zurich has the most students, with numbers continuing to rise. Also, the students make up a good proportion of the public transport commuter mass, which is why knowledge of spatio-temporal patterns of students could help to manage for example relief trains and other studying related infrastructure. Therefore, we want to investigate the influence of environmental factors on spatio-temporal movement patterns of students.

Looking at only a subset (students) might show different patterns, students make up a big part in ÖV (Bundesamt für Statistik, 2021) especially around a Studentenstadt wie Zürich… (Medienmitteilung KT ZH, 2021).

Packages

# clear space 
rm(list=ls())

# load packages 
library("readr")
library("dplyr")
library("ggplot2")
library("sf")
library("terra")
library("tmap")
library("gitcreds")
library("dplyr")
library("SimilarityMeasures")
library("lubridate")
library("plotly")

knitr::opts_chunk$set(echo = TRUE)
options(warning=FALSE)    # Don't show warnings
par(mfrow=c(1,1))         # Reset plot placement to normal 1 by 1

Results

Q1: Comparing spatio-temporal Movement between Weekdays and Weekends

Does the day of the week (weekend vs. workday) have an impact on spatio-temporal movement patterns? ### Spatial Analysis #### Load cleaned data

# load clean data 
posmo <- read_delim("posmo_data/posmo_trips.csv")

Segmentierte Daten nehmen - Speed & Distance

  • thought: How about removing certain transport modes (funicular, horse, Other?)
# Boxplot comparing Travelled Distance between weekend and weekday of travelmodes 
ggplot(posmo, aes(day_week, log(steplength), fill = day_week)) +
  geom_boxplot() +
  labs(title = "Travelled Distance per Transport Mode", subtitle = "Comparing Weekend vs. Weekday", fill = "Day of the Week") +
  ylab("Log Steplength [m]") +
  xlab(" ") +
  scale_fill_manual(values = c("weekday" = "lavenderblush3", "weekend" = "aquamarine3"), labels = c( "Weekday", "Weekend")) +
  facet_wrap(~transport_mode, nrow = 1) +
  theme_minimal()

# Boxplot comparing Speed between weekend and weekday of travelmodes 
ggplot(posmo, aes(day_week, log(speed), fill = day_week)) +
  geom_boxplot() +
  labs(title = "Speed per Transport Mode", subtitle = "Comparing Weekend vs. Weekday", fill = "Day of the Week") +
  ylab("Log Speed [m/s]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("weekday" = "lavenderblush3", "weekend" = "aquamarine3"), labels = c( "Weekday", "Weekend")) +
  facet_wrap(~transport_mode, nrow = 1) +
  theme_minimal()

# Table Summary 
question1_summary <- posmo |> 
  group_by(day_week, transport_mode) |> 
  summarise(steplength_mean = mean(steplength, na.rm = T),
            speed_mean = mean(speed, na.rm = T)) |> 
  mutate(percentage_steplength = steplength_mean/sum(steplength_mean)*100,
         percentage_speed = speed_mean/sum(speed_mean)*100)

Temporal Analysis

Duration

# Boxplot comparing trip duration between weekend and weekday of travelmodes 
ggplot() +
  geom_boxplot(data = posmo, aes(day_week, log(trip_duration/60), fill = day_week)) +
   labs(title = "Trip Duration per Transport Mode", subtitle = "Comparing Weekend vs. Weekday", fill = "Day of the Week") +
  ylab("Log Duration [min]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("weekday" = "lavenderblush3", "weekend" = "aquamarine3"), labels = c( "Weekday", "Weekend")) +
  facet_wrap(~transport_mode, nrow = 1) +
  theme_minimal()

# Boxplot comparing trip distance between weekend and weekday of travelmodes 
  ggplot() +
  geom_boxplot(data = posmo, aes(day_week, log(trip_dis), fill = day_week)) +
  labs(title = "Trip Distance per Transport Mode", subtitle = "Comparing Weekend vs. Weekday", fill = "Day of the Week") +
  ylab("Log Distance [m]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("weekday" = "lavenderblush3", "weekend" = "aquamarine3"), labels = c( "Weekday", "Weekend")) +
    facet_wrap(~transport_mode, nrow = 1) +
    theme_minimal()

Layout anpassen- Create Visual Maps

# transform posmo data into an sf object 
posmo <- st_as_sf(posmo, coords = c("X","Y"), crs = 2056) 

# 1. add grouping variable to the sf object
posmo_grouped <- group_by(posmo, dataset)

# 2. use summarise() to "dissolve" all point into a multipoint object
posmo_smry <- summarise(posmo_grouped)

# 3. run st_convex_hull()
mcp_posmo <- st_convex_hull(posmo_smry)
tmap_mode("view")

# segmented visualisation 
tm_shape(mcp_posmo) +
  tm_fill(col = "dataset", alpha = 0.4) +
  tm_legend(title = "User") +
  tm_shape(mcp_posmo) +
  tm_borders(col = "red") +
  tm_shape(posmo) +
  tm_dots(col = "day_week") 

Q2: Comparing spatio-temporal Movement over 24h

Analysis

Load cleaned data

# load clean data 
posmo <- read_delim("posmo_data/posmo_trips.csv")

Legende anschreiben - Analysis with segmented posmo data

# jetzt das noch mit segmentierten Daten 
# round datetime to 1h 
posmo_round <- posmo |>
  mutate(hour = lubridate::hour(datetime))

# create dataframe, which calculates mean steplength, speed per hour over all dates
# here with weekdays 
posmo_hour <- posmo_round |>
  group_by(hour, weekday) |>
  summarise(mean_dis = mean(steplength, na.rm = TRUE),
            mean_speed = mean(speed, na.rm = TRUE), 
            mean_duration = mean(trip_duration, na.rm = TRUE)) 

# mo-so with distance (steplength) travelled
ggplot(posmo_hour, aes(hour, mean_dis, col = weekday)) +
  geom_point() +
   geom_line(lwd = 0.7) +
  labs(title = "Mean Tavelled Distance over 24h", subtitle = "Compared from Monday to Sunday", color = "Day of the Week") +
  ylab("Mean Distance [m]") +
  xlab("Hour") +
  theme_minimal()

# mo-so with speed 
ggplot(posmo_hour, aes(hour, mean_speed, col = weekday)) +
  geom_point() +
  geom_line(lwd = 0.7) +
  labs(title = "Mean Speed over 24h", subtitle = "Compared from Monday to Sunday", color = "Day of the Week") +
  ylab("Mean Speed [m/s]") +
  xlab("Hour") +
  theme_minimal()

# mo-so with trip duration  
ggplot(posmo_hour, aes(hour, mean_duration/60, col = weekday)) +
  geom_point() +
  geom_line(lwd = 0.7) +
  labs(title = "Mean Trip Duration over 24h", subtitle = "Compared from Monday to Sunday", color = "Day of the Week") +
  ylab("Mean Duration [min]") +
  xlab("Hour") +
  theme_minimal()

# create dataframe, which calculates mean steplength, speed per hour over all dates
# here with days of the week (weekend vs. weekday)
posmo_day <- posmo_round |>
  group_by(hour, day_week)|>
  summarise(mean_dis = mean(steplength, na.rm = TRUE),
            mean_speed = mean(speed, na.rm = TRUE),
            mean_duration = mean(trip_duration, na.rm = TRUE)) 

# weekend vs. weekday with distance (steplength) travelled
ggplot(posmo_day, aes(hour, mean_dis, col = day_week)) +
  geom_point() +
  geom_line(lwd = 0.7) +
  labs(title = "Mean Tavelled Distance over 24h", subtitle = "Compared between Weekend vs. Weekday", color = "Weekend vs. Weekday") +
  ylab("Mean Distance [m]") +
  xlab("Hour") +
  theme_minimal()

# weekend vs. weekday with speed
ggplot(posmo_day, aes(hour, mean_speed, col = day_week)) +
  geom_point() +
  geom_line(lwd = 0.7) +
  labs(title = "Mean Speed over 24h", subtitle = "Compared between Weekend vs. Weekday", color = "Weekend vs. Weekday") +
  ylab("Mean Speed [m/s]") +
  xlab("Hour") +
  theme_minimal()

# weekend vs. weekday with trip duration 
ggplot(posmo_day, aes(hour, mean_duration/60, col = day_week)) +
  geom_point() +
    geom_line(lwd = 0.7) +
  labs(title = "Mean Trip Duration over 24h", subtitle = "Compared between Weekend vs. Weekday", color = "Weekend vs. Weekday") +
  ylab("Mean Duration [min]") +
  xlab("Hour") +
  scale_color_manual(values = c("weekday" = "lavenderblush3", "weekend" = "aquamarine3", labels = c("Weekday", "Weekend"))) +
  theme_minimal()

Q3: Comparing spatio-temporal Movement between Rainy and Dry Days

Spatial Analysis

Load cleaned data

# load clean data 
posmo <- read_delim("posmo_data/posmo_trips.csv")

Speed & Distance

# Boxplot compare rain/ no rain without grouping (overall steplength)
ggplot(posmo, aes(transport_mode, log(steplength), fill = rain_day)) +
  geom_boxplot() +
  labs(title = "Travelled Distance per Transport Mode", subtitle = "Comparing Rainy vs. Dry Days", fill = "Rain vs. Dry") +
  ylab("Log Steplength [m]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("no_rain" = "lavenderblush3", "rain" = "aquamarine3"), labels = c( "Dry", "Rain")) +
  theme_minimal()

# Boxplot compare rain/ no rain without grouping (overall speed)
ggplot(posmo, aes(transport_mode, log(speed), fill = rain_day)) +
  geom_boxplot() +
   labs(title = "Speed per Transport Mode", subtitle = "Comparing Rainy vs. Dry Days", fill = "Rain vs. Dry") +
  ylab("Log Speed [m/s]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("no_rain" = "lavenderblush3", "rain" = "aquamarine3"), labels = c( "Dry", "Rain")) +
  theme_minimal()

# Table Summary 
question2_summary <- posmo |> 
  group_by(rain_day, transport_mode) |> 
  summarise(steplength_mean = mean(steplength, na.rm = T),
            speed_mean = mean(speed, na.rm = T)) |> 
  mutate(percentage_steplength = steplength_mean/sum(steplength_mean)*100,
         percentage_speed = speed_mean/sum(speed_mean)*100)

Temporal Analysis

Load trips data

# load clean trips data 
posmo <- read_delim("posmo_data/posmo_trips.csv")

Duration

Mean duration of trips compared between Rain and Dry Days

# choose travelmodes which make sense (other and funicula excluding)
posmo <- posmo |>
  subset(transport_mode != "Other1" & transport_mode != "Funicular")

# check if subsetting worked 
unique(posmo$transport_mode)
[1] "Walk"  "Train" "Car"   "Tram"  "Bike"  "Bus"   "Horse"
# Calculate mean duration and mean dis of trips per day
# mean trip duration looks odd, need to check that (!!)
ggplot() +
  geom_boxplot(data = posmo, aes(transport_mode, trip_duration/60, fill = rain_day)) +
  labs(title = "Travelled Distance per Transport Mode", subtitle = "Comparing Rainy vs. Dry Days", fill = "Rain vs. Dry") +
  ylab("Duration [min]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("no_rain" = "lavenderblush3", "rain" = "aquamarine3"), labels = c( "Dry", "Rain")) +
  theme_minimal()

# mean trip distance compared between rain/no rain
ggplot() +
  geom_boxplot(data = posmo, aes(transport_mode, log(trip_dis), fill = rain_day)) +
    labs(title = "Speed per Transport Mode", subtitle = "Comparing Rainy vs. Dry Days", fill = "Rain vs. Dry") +
  ylab("Log Distance [m]") +
  xlab("Transport Mode") +
  scale_fill_manual(values = c("no_rain" = "lavenderblush3", "rain" = "aquamarine3"), labels = c( "Dry", "Rain")) +
  theme_minimal()

Legende anschreiben - Create Visual Maps

# transform posmo data into an sf object 
posmo <- st_as_sf(posmo, coords = c("X","Y"), crs = 2056) 


# 1. add grouping variable to the sf object
posmo_grouped <- group_by(posmo, dataset)

# 2. use summarise() to "dissolve" all point into a multipoint object
posmo_smry <- summarise(posmo_grouped)

# 3. run st_convex_hull()
mcp_posmo <- st_convex_hull(posmo_smry)
tmap_mode("view")

# segmented visualisation 
tm_shape(mcp_posmo) +
  tm_fill(col = "dataset", alpha = 0.4) +
  tm_shape(mcp_posmo) +
  tm_borders(col = "red") +
  tm_shape(posmo) +
  tm_dots(col = "rain_day") 

Literature

  • Brum-Bastos, V. S., Long, J. A., & Demšar, U. (2018). Weather effects on human mobility: A study using multi-channel sequence analysis. Computers, Environment and Urban Systems, 71, 131–152. https://doi.org/10.1016/j.compenvurbsys.2018.05.004

  • Bundesamt für Statistik. (2021). Pendlermobilität. Bundesamt für Statistik. https://www.bfs.admin.ch/bfs/de/home/statistiken/mobilitaet-verkehr/personenverkehr/pendlermobilitaet.html

  • Guo, P., Sun, Y., Chen, Q., Li, J., & Liu, Z. (2022). The Impact of Rainfall on Urban Human Mobility from Taxi GPS Data. Sustainability, 14(15), Article 15. https://doi.org/10.3390/su14159355

  • Halsey, L. G. (2016). Terrestrial movement energetics: Current knowledge and its application to the optimising animal. Journal of Experimental Biology, 219(10), 1424–1431.

  • Kyaing, K., Lwin, K., & Sekimoto, Y. (2017). Human mobility patterns for different regions in Myanmar based on CDRs data. IPTEK Journal of Proceedings Series, 3(6).

  • Liu, X., Sun, L., Sun, Q., & Gao, G. (2020). Spatial Variation of Taxi Demand Using GPS Trajectories and POI Data. Journal of Advanced Transportation, 2020, e7621576. https://doi.org/10.1155/2020/7621576

  • Medienmitteilung KT ZH. (2021). Bildung in Zahlen. Kanton Zürich. https://www.zh.ch/de/news-uebersicht/medienmitteilungen/2021/07/bildung-in-zahlen.html

  • Sathishkumar, Cho, Y., & Jangwoo, Park. (2020). Seoul bike trip duration prediction using data mining techniques. IET Intelligent Transport Systems, 14(11), 1465–1474. https://doi.org/10.1049/iet-its.2019.0796